Return probabilities and hitting times of random walks on sparse Erdos-Renyi graphs 
We consider random walks on random graphs, focusing on return probabilities and hitting times
for sparse Erdos-Renyi graphs. We show how to solve for the distribution of these quantities in the 
thermodynamic limit and we find that these distributions exhibit structures on all scales. 
I. Introduction

Random walks are some of the simplest stochastic processes [1, ?] and yet
they arise in many scientific fields such as pure mathematics, statistical
physics or even biology [?, ?, ?, ?]. A fundamental quantity for computing
properties of random walks is the first passage time [?, ?]. Consider a
random walk on a graph G, starting at node s; given another arbitrary node t
(the target), the hitting time H(s, t) is the mean of the first passage time
to go from s to t. There is a well-known relation between the value of
H(s, t) averaged over all nodes t of the graph and the spectrum of its
adjacency matrix, as derived in [?].

In this work we focus on random graphs [?, 10]. For dense Erdos-Renyi
graphs [11], the spectrum of the diffusion operator converges to that of a
Gaussian random matrix and one can show [12, 13] that if N is the number of
nodes of G, the hitting time is N + o(N). As far as we know, there is no
analogous result for sparse graphs: only a mean-field approximation has been
derived [14], which neglects certain fluctuations. This situation is
surprising because the problem has been open for many years, but the lack of
progress reflects the difficulty of deriving analytically the spectrum of
the adjacency matrix of sparse random graphs [?, ?]. Nevertheless, we show
here how to bypass this difficulty by exploiting the local structure of
sparse random graphs, which is tree-like with probability 1 at large N. We
thus map the $N \to \infty$ problem to a diffusion process on random trees,
thereby providing an analytical calculation of the hitting times and of a
closely related quantity, the probability that the walker returns to its
starting node in a finite time.

In what follows, we first specify the stochastic dynamics of the random walk
and the kinds of random graphs we use. We then compute the hitting times and
probabilities of return on random d-regular graphs [17]. That calculation is
then generalized to sparse Erdos-Renyi graphs, for which the distributions
turn out to be quite subtle.

II. The model 

We consider a random walker on a graph G. At each time step n, the walker
hops to one of the neighboring nodes, all such nodes being equi-probable. It
is convenient to introduce the adjacency matrix A of G: $A_{ij} = 1$ if nodes
i and j are connected by an edge and $A_{ij} = 0$ otherwise. Defining at each
time step n the probability $v_i^{(n)}$ of having the walker be at node i,
the vector of probabilities obeys the master equation

$$ v_i^{(n+1)} = \sum_{j \sim i} \frac{v_j^{(n)}}{d_j} = \left[ A D^{-1} v^{(n)} \right]_i \qquad (1) $$

where the sum is taken over all nodes j that are adjacent to the node i. The
matrix D is diagonal; its i-th diagonal element $D_{ii}$ is equal to the
degree $d_i$ of the i-th node.

To investigate the hitting time of the walker to go from s to t, it is
enough to initialize the vector $v^{(0)}$ to be zero on all nodes except at s
where it is 1, and to impose absorbing conditions at the target node t,
i.e., $v_t^{(n)} = 0$ at all n. Then the probability of having a first
passage time equal to n is given by the flux into node t at that time step
[?]. A modified treatment of the walker allows one to obtain the probability
of return to s.
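The absorbing-walker procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the graph is stored as adjacency lists, the function name `mean_hitting_time` and its tolerance parameters are hypothetical, and the loop stops once the surviving probability is negligible.

```python
def mean_hitting_time(adj, s, t, tol=1e-12, max_steps=1_000_000):
    """Mean first passage time from s to t on a graph given as adjacency lists.

    Iterates v <- A D^{-1} v, resets v[t] to 0 after each step (absorption),
    and accumulates n * (flux into t at step n).
    """
    n_nodes = len(adj)
    v = [0.0] * n_nodes
    v[s] = 1.0                      # walker starts at s with probability 1
    mean, remaining = 0.0, 1.0
    for n in range(1, max_steps + 1):
        new_v = [0.0] * n_nodes
        for j, pj in enumerate(v):
            if pj == 0.0:
                continue
            share = pj / len(adj[j])  # each neighbour of j is equi-probable
            for i in adj[j]:
                new_v[i] += share
        flux = new_v[t]             # probability that the first passage time is n
        new_v[t] = 0.0              # absorbing condition at the target
        mean += n * flux
        remaining -= flux
        v = new_v
        if remaining < tol:         # almost all walkers have been absorbed
            break
    return mean


# Toy check on the complete graph with 5 nodes (hypothetical example).
k5 = [[j for j in range(5) if j != i] for i in range(5)]
print(mean_hitting_time(k5, 0, 1))
```

On the complete graph each step hits the target with probability 1/4, so the result should be close to 4; the cost per step is proportional to the number of edges.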

Our mathematical solution concerns Erdos-Renyi graphs in the ensemble
G(N, p), where N is the total number of nodes and each pair of nodes has
probability p to be connected by an edge. For sparse graphs, p = c/N where
$c = \langle d \rangle$ is the mean degree of nodes. We shall also consider
fixed degree random graphs, also called random d-regular graphs, where each
node has exactly degree d but connections are otherwise random [17].

III. Hitting times on random d-regular graphs 

Let us first compute the hitting time on random regular graphs, exploiting
their local tree-like nature. Indeed, loops can arise in random d-regular
graphs [17] but their typical length is $O(\ln N)$. Thus it is expected that
most properties can be obtained by studying what happens locally, as long as
boundary conditions at "infinity" are properly handled. Such an approach has
been used in many contexts with a high level of success [?, ?].

For a given random regular graph, of fixed degree d, we consider a node t
and ask what is the mean of H(s, t) when averaged over all s. We need to
solve a diffusion problem where at time n = 0 a walker is equi-distributed
amongst the N - 1 nodes s ($s \neq t$) and if the walker hits node t it gets
absorbed. If one denotes by $F^{(n)}$ the probability flux into node t at
step n, then the hitting time averaged over all s is given by the first
moment of n distributed as $F^{(n)}$.

In the neighborhood of t, the graph is a Cayley tree with probability one at
large number of nodes and thus does not depend on t in the large N limit.
Given the diffusion-absorption process, the vector of probabilities quickly
converges to the dominant eigenvector of the master equation (that with the
largest eigenvalue, decaying the slowest). In the limit of large N, the
decay rate goes to zero and all the transient behavior (associated with the
other eigenvectors) becomes irrelevant. When $N \to \infty$, it is then
enough to determine the dominant eigenvector, imposing zero boundary
conditions at the root node t (labeled 0 hereafter) and 1/(N - 1) boundary
conditions for the far away nodes.

As $N \to \infty$, the recurrence equation that is satisfied by the
eigenvector's elements leads to $d A_{k+1} = A_{k+2} + (d-1) A_k$, where
$A_k$ is the sum of the probabilities on the nodes that are at distance k
from the root node. Solving this, subject to the normalization and boundary
conditions, leads to the value of $A_1$ and thus the flux flowing into the
absorbing node: $F = A_1/d$.
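For readers who want the intermediate steps, the recurrence can be solved in closed form. The following is a sketch under two assumptions that are not spelled out in the text: the shell at distance k of the Cayley tree contains $d(d-1)^{k-1}$ nodes, and each far away node carries probability close to 1/N.

```latex
\begin{aligned}
&A_k = x^k \;\Rightarrow\; x^2 - d\,x + (d-1) = 0 \;\Rightarrow\; x \in \{1,\ d-1\},
\qquad A_k = \alpha + \beta\,(d-1)^k, \\
&A_0 = 0 \;\Rightarrow\; \alpha = -\beta, \qquad
A_k \simeq \frac{d\,(d-1)^{k-1}}{N} \ (k \gg 1) \;\Rightarrow\; \beta = \frac{d}{(d-1)\,N}, \\
&A_1 = \beta\,(d-2) = \frac{d\,(d-2)}{(d-1)\,N}, \qquad
F = \frac{A_1}{d} = \frac{d-2}{(d-1)\,N}.
\end{aligned}
```

The inverse of this flux is $N(d-1)/(d-2)$, which diverges at d = 2, where the local structure is a one dimensional chain.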

Note that since at large N only the leading eigenvector matters, the first
passage time is exponentially distributed with a mean given by the inverse
of this flux. This then gives for random d-regular graphs a hitting time
behaving at large N as

$$ H \simeq \frac{d-1}{d-2}\, N \qquad (2) $$

Finally, it is worth noting that for random d-regular graphs, with
probability 1 in the large N limit, the ratio H(s, t)/N does not depend on
the starting node s. Also, because of the regularity of the graph, this
quantity does not depend on t either.

IV. Probability of return on random d-regular graphs

On any finite graph, a walker leaving node s will return with probability
one. Nevertheless, if one considers the distribution of return times for
increasing values of N, one finds an $N \to \infty$ point-wise limiting
distribution which does not integrate to 1. Indeed, in that limit, the
return times will be finite with probability f and will diverge linearly in
N with probability 1 - f. If $f \neq 1$, the walk is transient. On the
infinite Cayley tree, f can be computed simply by using the homogeneity of
the graph as follows.

Take s to be the root of an infinite Cayley tree. The walker must make a
first step; let it be to one of its neighbors s'. Define r as the
probability for the walk to return to s given that it has stepped to s'.
Using the equivalence of all nodes, one can write a series for r:

$$ r = \frac{1}{d} \sum_{k=0}^{\infty} \left( \frac{d-1}{d}\, r \right)^k \qquad (3) $$

where $d \geq 2$ is the degree of the Cayley tree. In this series, the term
of $O(r^k)$ corresponds to the probability that the walk returns k times to
node s' before going back to the root s. Summing this geometric series gives
two possible values: r = 1 and r = 1/(d - 1). Furthermore, it is easy to see
that f = r. If d = 2, we have a one dimensional walker and f = 1. For
$d \geq 3$, the walk is transient and f = 1/(d - 1).
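As a quick numerical illustration of which root is selected, one can iterate the summed form of the series, r = 1/(d - (d - 1) r). The sketch below makes an assumption of its own: the iteration starts from r = 0, as if walkers reaching a far away cutoff never came back. The function name is hypothetical.

```python
def cayley_return_probability(d, iterations=200):
    """Iterate r <- 1 / (d - (d - 1) * r), the summed geometric series for r."""
    r = 0.0  # assumption: start from "never returns"
    for _ in range(iterations):
        r = 1.0 / (d - (d - 1) * r)
    return r


for d in (2, 3, 4, 10):
    print(d, cayley_return_probability(d))
```

For d of 3 or more the iteration settles on 1/(d - 1), since the root r = 1 is unstable under this map; for d = 2 it creeps only slowly towards 1, reflecting the marginal, recurrent one dimensional case.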



V. Probability of return on Erdos-Renyi random graphs

Here we extend the previous calculation of return probabilities to the case
of Erdos-Renyi graphs. Just as for the random d-regular graphs, we exploit
the fact that with probability 1 in the large N limit the neighborhood of a
node belonging to a sparse Erdos-Renyi graph is locally tree-like. We denote
by $c = \langle d \rangle$ the mean degree of these graphs; the probability
to have a node of degree d is $P(d) = e^{-c} c^d / d!$, i.e., is given by
the Poisson distribution.

To find the probability to return in a finite number of steps (at large N)
for a walker starting on the root node (hereafter referred to as 0), we
reconsider the series of Eq. (3). Suppose that at the first step the walker
moves to the neighbor j of the root node, and that $d_j$ is the connectivity
of that node. If the walker is to return to 0, it can do so immediately, or
it can perform k loops from j (avoiding 0), stepping back to 0 only after
its (k + 1)th visit to node j. By a loop from j, we mean a step to one of
the $d_j - 1$ neighbors of j other than 0, then a finite number of steps
that do not visit j, and then finally a return to j. The point is that in
our system the walker cannot come back to 0 other than through the edge
connecting j to 0: any other route requires going to "infinity" and thus an
infinite number of steps. (Since we are dealing with a return probability on
an infinite graph, the walks returning to 0 must have a finite number of
steps.)

For the edges connecting node j to a node other than 0, let the return
probabilities be $r_{j(1)}, r_{j(2)}, \ldots, r_{j(d_j-1)}$.

Given these $r_{j(m)}$, the probability $r_0$ to return to the root node if
the walk's first step is to node j is

$$ r_0 = \frac{1}{d_j} \sum_{k=0}^{\infty} \left[ \frac{1}{d_j} \sum_{m=1}^{d_j-1} r_{j(m)} \right]^k = \frac{1}{d_j - \sum_{m=1}^{d_j-1} r_{j(m)}} \qquad (4) $$

However, the $r_{j(m)}$ are i.i.d. random variables drawn from a
distribution p(r). In the Erdos-Renyi ensemble, 0 connects to a random node
(j here) which itself connects to other random nodes. The distribution of
$r_0$ is thus the same as that of the $r_{j(m)}$, and Eq. (4) determines
implicitly a self-consistent functional equation for p(r). This can be
written formally as:

$$ p(r) = \sum_{z=0}^{\infty} P(z) \int dr_1 \cdots \int dr_z \; p(r_1) \cdots p(r_z) \; \delta\!\left( r - \frac{1}{z + 1 - \sum_{m=1}^{z} r_m} \right) $$

where P(z) is the Poisson distribution (of $z = d_j - 1$), and $\delta(x)$
is the Dirac delta function. Also, note that in this formula the z = 0 term
must be interpreted as $P(0)\,\delta(1 - r)$.

FIG. 1: The probability density of the return probability r when stepping
from a given node to just one of its neighbors on an infinite Erdos-Renyi
graph with mean degree 3. For ease of presentation, the delta function
contribution at r = 1 has been removed and the rest has been rescaled to
have a total probability of 1. Inset: zoom revealing the structure on a
finer scale.

We have solved for p by numerical iteration, demanding a stable
distribution. Because p has both a continuous part for 0 < r < 1 and a delta
function part at r = 1, it was necessary to treat these two parts
separately. We find that the convergence in the number of iterations is
quite fast but that the distribution is not differentiable, forcing one to
use very small bins in r. For illustration, we display in Fig. 1 the
distribution p(r) when the mean degree is 3. Note the highly irregular
nature of this distribution, which is nevertheless continuous. It also
exhibits some degree of self-similarity; for instance, one motif, namely the
distribution for 0 < r < 0.5, is repeated at larger values of r but with a
smaller amplitude and with some distortion. We also present a zoom of the
distribution in the inset to illustrate the fact that p(r) is structured on
all scales.
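One way to carry out such a numerical iteration is a population-dynamics scheme, sketched below. This is not necessarily the binning procedure used for Fig. 1: it represents p(r) by a pool of samples instead of a histogram, and the pool size, number of generations, seed and initial condition (all r = 0) are assumptions chosen for illustration. Each new sample applies r = 1/(z + 1 - sum of z sampled r values), with z Poisson distributed.

```python
import math
import random


def sample_poisson(rng, mean):
    """Poisson sample by Knuth's multiplication method (fine for small means)."""
    threshold = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def iterate_return_probabilities(c, pool_size=5000, generations=20, seed=0):
    """Population-dynamics estimate of p(r) for mean degree c."""
    rng = random.Random(seed)
    pool = [0.0] * pool_size  # assumption: walkers at the cutoff depth never return
    for _ in range(generations):
        new_pool = []
        for _ in range(pool_size):
            z = sample_poisson(rng, c)                       # z = d_j - 1
            total = sum(rng.choice(pool) for _ in range(z))  # sum of r_{j(m)}
            new_pool.append(1.0 / (z + 1 - total))           # r_0 = 1 / (d_j - sum)
        pool = new_pool
    return pool


pool = iterate_return_probabilities(3.0)
delta_weight = sum(1 for r in pool if r == 1.0) / len(pool)
print("estimated weight of the delta peak at r = 1:", delta_weight)
```

A histogram of the samples with r < 1 gives a rough picture of the continuous part; resolving the fine structure would need a much larger pool and very narrow bins.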

Note that the intensity A of the Dirac part of p gives the probability for
the first step of the walk to lead into a finite part of the graph. It is
thus simply given by the solution (other than the trivial one, A = 1, when
c > 1) to the equation $A = \sum_{z=0}^{\infty} P(z) A^z$, obtained by
requiring that node j also have all its other neighbors in a finite part of
the graph. In such a situation, one has r = 1.
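For a Poisson P(z) the sum can be done explicitly, so the condition reads A = exp(-c (1 - A)). A minimal sketch of solving it by fixed-point iteration follows; the function name and the starting value A = 0 are assumptions made here for illustration.

```python
import math


def finite_branch_probability(c, iterations=200):
    """Smallest solution in [0, 1] of A = sum_z P(z) A**z = exp(-c * (1 - A))."""
    a = 0.0  # starting from 0 converges to the smallest root
    for _ in range(iterations):
        a = math.exp(-c * (1.0 - a))
    return a


a = finite_branch_probability(3.0)
print(a, 1.0 - a)  # delta-peak weight, and its complement, for c = 3
```

Below the percolation threshold (c < 1) the same iteration converges to A = 1, i.e., every branch is finite.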

VI. Hitting times on Erdos-Renyi random graphs 

To compute the hitting time H(s, t), we take s and t to be on the same
connected component whose size we denote by $N_\infty$. For Erdos-Renyi
graphs, we work beyond the percolation threshold, c > 1, on the "infinite"
component, so $N_\infty \approx (1 - A) N$, with A the weight of the Dirac
part of p. With probability 1, the hitting time H(s, t) scales with N, has
negligible fluctuations with s, and depends only on the neighborhood
properties of t. We thus focus on H(t), the mean of H(s, t) when averaging
over all nodes s distinct from t. This problem has been solved for dense
Erdos-Renyi graphs and leads to H(t) = N + o(N). For the sparse case, no
exact treatment has been proposed, but a mean-field like approximation gives
rather good results [13]. We now provide an exact mathematical approach.



FIG. 2: The probability density of $H/N_\infty$ on Erdos-Renyi graphs with
mean degree 4, in the large graph size limit. H is the hitting time of walks
residing on the graph's infinite (percolating) component and absorbed at a
random node t; $N_\infty$ is the size of that connected component.



As explained previously, we can follow the probability of finding the walker
on any node. The initial condition is that every node except t is occupied
with the same probability $1/(N_\infty - 1)$. The absorption at node t,
hereafter labeled 0, imposes $v_0^{(n)} = 0$ at all times. The master
equation for this process is therefore

$$ v^{(n+1)} = \left[ T A D^{-1} \right] v^{(n)} \qquad (5) $$

where $T_{ij} = \delta_{ij} (1 - \delta_{0i})$.



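Denote by S the leading eigenvector of the diffusion operator $A D^{-1}$
having no absorption, with eigenvalue 1. For a normalisation of the
probabilities to 1, one has $S_i = d_i / (N_\infty \langle d \rangle_\infty)$
where $d_i$ is the degree of the i-th node. Furthermore,
$\langle d \rangle_\infty$ is the mean degree on the connected component
considered, which in our case is not c because we have the constraint of
belonging to the infinite component; instead it is

$$ \langle d \rangle_\infty = \frac{\sum_{z=1}^{\infty} z \, (1 - A^z) \, P(z)}{\sum_{z=1}^{\infty} (1 - A^z) \, P(z)} \qquad (6) $$

(here A is the weight of the Dirac part of p introduced in Section V, not
the adjacency matrix).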
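Eq. (6) is straightforward to evaluate numerically. The sketch below does so for Poisson degrees and compares the result with the closed form c (1 + A), which follows from a short calculation using A = exp(-c (1 - A)) and is not stated in the text. The function name and the truncation `z_max` are assumptions; the expression is only meaningful for c > 1.

```python
import math


def giant_component_mean_degree(c, z_max=200):
    """Evaluate Eq. (6) for Poisson degrees of mean c; returns (<d>_inf, A)."""
    a = 0.0
    for _ in range(500):                 # fixed point of A = exp(-c (1 - A))
        a = math.exp(-c * (1.0 - a))
    numerator = denominator = 0.0
    p = math.exp(-c)                     # P(0)
    for z in range(1, z_max + 1):
        p *= c / z                       # P(z) = P(z - 1) * c / z
        weight = (1.0 - a ** z) * p      # degree-z node on the infinite component
        numerator += z * weight
        denominator += weight
    return numerator / denominator, a


mean_deg, a = giant_component_mean_degree(4.0)
print(mean_deg, 4.0 * (1.0 + a))  # truncated sum vs closed form c (1 + A)
```

The mean degree on the infinite component is slightly larger than c, as expected since high-degree nodes are more likely to belong to it.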



It is easy to check that under evolution without absorption S is unchanged:
since the walk is on a connected component, this is the only normalized
steady state distribution. We now introduce the vector $b^{(n)}$ that
represents the difference between the vector S and the vector $v^{(n)}$:

$$ \frac{1}{N_\infty} b_i^{(n)} = \frac{1}{N_\infty} \frac{d_i}{\langle d \rangle_\infty} - v_i^{(n)} \qquad (7) $$

The absorption condition at 0 then imposes
$b_0^{(n)} = d_0 / \langle d \rangle_\infty$ for all n. Far away from the
root node, the distribution quickly relaxes to the leading eigenvector of
the diffusion equation. In the $N_\infty \to \infty$ limit, almost all nodes
are oblivious to the absorption, so we can compute the hitting time by
assuming that $v_m^{(n)}$ is equal to $S_m$ for all nodes m at "infinity",
which gives us the boundary condition $b_m^{(n)} = 0$ at all times.

Now we can interpret the evolution equation for $b^{(n)}$ as describing a
process of multiple random walkers diffusing on the graph, with in addition
a fixed source at the root node. Specifically, at each time step n, new
walkers are created at the root and step away while any walkers incoming to
the root are removed from the system. With increasing number of iterations,
the vector $b^{(n)}$ converges to a steady state b (as $v^{(n)}$ converges
to v, a leading eigenvector of $T A D^{-1}$) in which for each edge (0j)
connected to the root node, there is an outgoing flux of
$1/\langle d \rangle_\infty$ and a corresponding incoming flux of
$r_j / \langle d \rangle_\infty$, where $r_j$ is the probability of return
to 0 of a walker given that it has stepped to j. The flux into $b_0$ is then
equal to the flux of "returning" random walkers:

$$ \sum_{\langle j0 \rangle} \frac{b_j}{d_j} = \frac{1}{\langle d \rangle_\infty} \sum_{\langle j0 \rangle} r_j \qquad (8) $$

Coming back to the formalism based on v, i.e., the leading eigenvector of
$T A D^{-1}$, the net total flux F(0) into the absorbing node is given by

$$ F(0) = \sum_{\langle j0 \rangle} \frac{v_j}{d_j} \qquad (9) $$

Using Eqs. (7) and (8) one obtains the final expression

$$ F(0) = \frac{1}{N_\infty \langle d \rangle_\infty} \sum_{\langle j0 \rangle} (1 - r_j) \qquad (10) $$
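As a consistency check, the flux formula can be applied to the regular case. The sketch below assumes the expression $N_\infty F(0) = \langle d \rangle_\infty^{-1} \sum_j (1 - r_j)$ with H = 1/F(0), and uses the Cayley-tree value $r_j = 1/(d-1)$ for every neighbor; the function name is hypothetical.

```python
def scaled_hitting_time(return_probs, mean_degree):
    """H / N for an absorbing node whose neighbours have return probabilities r_j.

    Uses N * F = (1 / <d>) * sum_j (1 - r_j) and H = 1 / F.
    """
    scaled_flux = sum(1.0 - r for r in return_probs) / mean_degree
    if scaled_flux == 0.0:
        return float("inf")  # all branches finite: the node is not on the infinite component
    return 1.0 / scaled_flux


# Regular-graph check: d neighbours, each with r_j = 1 / (d - 1), and <d> = d.
for d in (3, 4, 10):
    print(d, scaled_hitting_time([1.0 / (d - 1)] * d, d), (d - 1) / (d - 2))
```

The two printed columns should agree, reproducing the ratio (d - 1)/(d - 2) found for random d-regular graphs.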
In the previous section we derived the distribution of $r_j$, from which one
easily obtains the distribution of H(0) = 1/F(0). First, for each value
$z \geq 1$ of $d_0$ (the degree of the root node), we compute the
distribution of F(0). The delta function part of this distribution (at
F(0) = 0) is removed and the remaining distribution is rescaled to have
norm 1. This corresponds to enforcing the constraint that the absorbing node
is on the infinite component of the Erdos-Renyi graph; the part of the
distribution of F(0) which gives zero flux corresponds to being on a finite
component. Second, the distribution of $H_z = 1/F(0)$ is extracted: call it
$\mu_z(H_z)$. Finally, given all the distributions $\mu_z$
($1 \leq z < \infty$), the distribution of H at a random node is obtained by
averaging the $\mu_z$ with their respective weights:

$$ \mu(H) = \frac{\sum_{z=1}^{\infty} \mu_z(H) \, P(z) \, (1 - A^z)}{\sum_{z=1}^{\infty} P(z) \, (1 - A^z)} \qquad (11) $$
An example of such a distribution is shown in Fig. 2 for
$\langle d \rangle = 4$. Furthermore, the distribution of H also gives the
distribution of first passage times since at large N, for each value of H,
the first passage time n is distributed as $\exp(-n/H)$. Finally, to obtain
the mean hitting time, it is enough to compute the mean of the distribution
of H. We have done so and show in Fig. 3 the resulting values, normalized by
$N_\infty$, as a function of the mean degree of the graphs. At large
$\langle d \rangle$, the ratio converges to 1 with
$O(1/\langle d \rangle)$ corrections: one recovers the dense graph result.
Also, the behavior is very smooth and we find that it differs from the value
when the degree does not fluctuate (the case of random d-regular graphs)
also by $O(1/\langle d \rangle)$.

FIG. 3: Mean hitting times divided by $N_\infty$ for Erdos-Renyi graphs in
the limit of large graphs, as a function of the parameter
$c = \langle d \rangle$ equal to the graphs' mean node degree. $N_\infty$ is
the size of the "infinite" component, $N_\infty \approx (1 - A) N$ for
graphs of N nodes.

FIG. 4: Plot comparing numerical simulation with analytical results. The x
axis shows the size of the largest connected component of the graph, the y
axis shows the mean hitting time for such a component.

VII. Comparison with numerical simulations 

Fig. 4 shows the mean hitting times on the largest connected component of an
Erdos-Renyi graph with mean degree $\langle d \rangle = 4$. The estimate
from Eq. (11) is compared with values obtained from a numerical simulation
in which we followed the probability vector $v^{(n)}$ as in Eq. (5). The
mean hitting times were then averaged over multiple graphs. The error bars
are shown as well. We found that values determined from the simulations
approach their large N limit rather fast and that this limit is compatible
with our analytical result, the relative difference being compatible with an
O(1/N) convergence. The same conclusion also holds in the context of random
d-regular graphs (cf. Eq. (2)).

VIII. Discussion and conclusion 

We considered random walks on random graphs, focusing on two quantities: the
distribution of hitting times and the probability that a walker will return
to its starting point in a finite time. (The hitting time is the mean of
first passage times.) We derived a way to calculate the large N behavior of
these quantities on two families of random graphs, finding non-trivial and
intricate distributions associated with the discrete nature of possible
neighborhoods of a node. Finally, we compared the calculated results with
numerical simulation and found excellent agreement, supporting the
expectation that the loops in these graphs can be treated by appropriate
boundary conditions on infinite trees.